Siamese Visual Tracking With Residual Fusion Learning
Abstract
Multi-stage feature fusion is effective for improving the tracking performance of deep Siamese trackers. Unfortunately, conventional fusion approaches, such as the weighted average, are so simple that they inappropriately combine features with diverse characteristics. In addition, the fusion module is generally optimized along with the network module, which may degrade the performance of the whole tracker. In this paper, we propose a novel Siamese tracker that exploits the expression capacity of residual fusion learning (SiamRFL). Specifically, the tracker employs deep-layer features as direct input to semantically recognize the object from the background, and refines the object state with local detail patterns by exploring shallow-layer features through the residual channel. The classification features and regression features can be fused respectively by deploying multiple residual fusion units. To avoid the degradation problem, we also present an ensemble training framework for our tracker, in which different loss functions are introduced to individually optimize the fusion modules and the network modules. Compared with the baseline tracker SiamRPN++, the proposed tracker achieves favorable gains of $0.696\rightarrow 0.709$, $0.285\rightarrow 0.308$, $0.603\rightarrow 0.624$, $0.496\rightarrow 0.520$ and $0.517\rightarrow 0.559$ on the OTB100, VOT2019, UAV123, LaSOT and GOT10k datasets, outperforming other approaches by an obvious margin.
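The residual fusion idea in the abstract (deep-layer features as the direct input, shallow-layer features added through a residual channel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `residual_fusion_unit`, the 1x1 channel-mixing weight, and all tensor shapes are assumptions chosen only for illustration.

```python
import numpy as np


def residual_fusion_unit(deep, shallow, weight):
    """Illustrative residual fusion: output = deep + F(shallow).

    deep:    (C, H, W) deep-layer feature map, passed through unchanged.
    shallow: (Cs, H, W) shallow-layer feature map, same spatial size (assumed).
    weight:  (C, Cs) hypothetical 1x1 convolution mapping Cs channels to C.
    """
    # A 1x1 convolution is a per-pixel linear mix of the input channels.
    residual = np.einsum("oc,chw->ohw", weight, shallow)
    # The deep features form the identity path; the shallow features
    # only contribute a correction on top of them.
    return deep + residual


rng = np.random.default_rng(0)
deep = rng.standard_normal((4, 5, 5))
shallow = rng.standard_normal((8, 5, 5))
weight = 0.1 * rng.standard_normal((4, 8))

fused = residual_fusion_unit(deep, shallow, weight)
# With a zero residual weight the unit reduces to the deep features alone.
identity = residual_fusion_unit(deep, shallow, np.zeros((4, 8)))

print(fused.shape)                  # (4, 5, 5)
print(np.allclose(identity, deep))  # True
```

Unlike a weighted average, this form leaves the deep features intact when the residual branch outputs zero, so the shallow branch only needs to learn a refinement. In the tracker described above, separate units of this kind would be applied to the classification and regression features.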
Similar resources
Siamese Learning Visual Tracking: A Survey
The aim of this survey is to review the machine learning and stochastic techniques used for the challenging problem of visual tracking, and the ways existing work currently applies them. It is not intended to study the whole tracking literature of the last decades, as this seems rather impossible given the incredibly vast number of published papers. Thi...
Fusion with Diffusion for Robust Visual Tracking
A weighted graph is used as an underlying structure of many algorithms like semi-supervised learning and spectral clustering. If the edge weights are determined by a single similarity measure, then it is hard if not impossible to capture all relevant aspects of similarity. In particular, in the case of visual object matching it is beneficial to integrate diffe...
Learning multiple visual domains with residual adapters
There is a growing interest in learning data representations that work well for many different types of problems and data. In this paper, we look in particular at the task of learning a single visual representation that can be successfully utilized in the analysis of very different types of images, from dog breeds to stop signs and digits. Inspired by recent work on learning networks that predi...
Learning Text Similarity with Siamese Recurrent Networks
This paper presents a deep architecture for learning a similarity metric on variable-length character sequences. The model combines a stack of character-level bidirectional LSTMs with a Siamese architecture. It learns to project variable-length strings into a fixed-dimensional embedding space by using only information about the similarity between pairs of strings. This model is applied to the ta...
Robust Visual Tracking Based on Feature Fusion
To drive computational visual tracking toward more robust outputs, we need a more accurate and adaptive feature representation of the target. In this paper, we propose a tracking algorithm using a new multi-feature statistical observation model based on the particle filter framework. Four complementary features are described with histograms and fused in a novel way. We demonstrate how to establish obs...
Journal
Journal title: IEEE Access
Year: 2022
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3134066